Smart Paper Notes

PoeticTTS -- Controllable Poetry Reading for Literary Studies

Julia Koch, Florian Lux, Nadja Schauffler, Toni Bernhart, Felix Dieterle, Jonas Kuhn, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu

Categories: Natural Language Processing | Machine Learning

2022-07-11

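Speech synthesis for poetry is challenging because of the specific intonation patterns inherent to poetic speech. In this work, we propose an approach to synthesizing poems with almost human-like naturalness, so that literary scholars can systematically examine hypotheses about the interplay between text, spoken realization, and the listener's perception of poems. To meet these special requirements of literary studies, we resynthesize poems by cloning prosodic values from a human reference recitation, and then use fine-grained prosody control to manipulate the synthetic speech in a human-in-the-loop setting, altering the recitation with respect to specific phenomena. We find that fine-tuning a TTS model on poetry captures poetic intonation patterns to a large extent, which benefits prosody cloning and manipulation, and we verify the success of our approach both in an objective evaluation and in human studies.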

Between welcome culture and border fence. A dataset on the European refugee crisis in German newspaper reports

Nico Blokker, André Blessing, Erenay Dayanik, Jonas Kuhn, Sebastian Padó, Gabriella Lapesa

Categories: Natural Language Processing

2021-11-19

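Newspaper reports are a rich source of information on how public debate unfolds in specific policy fields, and they can serve as a basis for inquiry in political science. Such debates are often triggered by critical events that attract public attention and provoke reactions from political actors: crisis sparks the debate. However, because reliable annotation and modeling are challenging, few large-scale datasets with high-quality annotation are available. This paper introduces DebateNet2.0, which traces the political discourse on the European refugee crisis in the German quality newspaper taz during 2015. The core units of our annotation are political claims (requests for specific actions to be taken within the policy field) and the actors who make them (politicians, parties, etc.). The contribution of this paper is twofold. First, we document and release DebateNet2.0 together with its companion R package, mardyR, guiding the reader through the practical and conceptual issues involved in annotating policy debates in newspapers. Second, we outline Discourse Network Analysis (DNA) and apply it to DebateNet2.0, comparing two crucial moments of the policy debate on the "refugee crisis": the migration flux through the Mediterranean in April/May and the one along the Balkan route in September/October. Besides the released resources and the case study, our contribution is also methodological: we walk the reader through the steps from a newspaper article to a discourse network, showing that there is not just one discourse network for the German migration debate but multiple ones, depending on the topic of interest (political actors, policy fields, time spans).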

NISQ-ready community detection based on separation-node identification

Jonas Stein, Dominik Ott, Mirco Schoenfeld, Sebastian Feld

Categories: Machine Learning

2022-12-30

The analysis of network structure is essential to many scientific areas, ranging from biology to sociology. As the computational task of clustering these networks into partitions, i.e., solving the community detection problem, is generally NP-hard, heuristic solutions are indispensable. The exploration of expedient heuristics has led to the development of particularly promising approaches in the emerging technology of quantum computing. Motivated by the substantial hardware demands of all established quantum community detection approaches, we introduce a novel QUBO-based approach that needs only as many qubits as the graph has nodes and is represented by a QUBO matrix as sparse as the input graph's adjacency matrix. The substantial improvement in the sparsity of the QUBO matrix, which is typically very dense in related work, is achieved through the novel concept of separation nodes. Instead of assigning every node to a community directly, this approach relies on identifying a separation-node set which, upon its removal from the graph, yields a set of connected components that represent the core components of the communities. A greedy heuristic then assigns the nodes from the separation-node set to the identified community cores, and the subsequent experimental results provide a proof of concept. This work hence presents a promising approach to NISQ-ready quantum community detection, catalyzing the application of quantum computers to the network structure analysis of large-scale, real-world problem instances.
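
The classical post-processing described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the separation-node set is taken as given (the QUBO step that finds it is not shown), the function names `community_cores` and `assign_greedily` are hypothetical, and the greedy rule used here (attach each separation node to the community that currently holds most of its neighbours) is an assumption, since the abstract does not specify the heuristic.

```python
from collections import deque


def community_cores(adjacency, separation_nodes):
    """Return the connected components left after removing separation_nodes.

    adjacency maps each node to an iterable of its neighbours (undirected graph).
    """
    removed = set(separation_nodes)
    seen = set()
    cores = []
    for start in adjacency:
        if start in removed or start in seen:
            continue
        # Breadth-first search restricted to nodes that were not removed.
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in removed and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        cores.append(component)
    return cores


def assign_greedily(adjacency, cores, separation_nodes):
    """Attach each separation node to the community with most of its neighbours.

    Assumed rule (not from the paper); ties go to the earliest community.
    """
    communities = [set(core) for core in cores]
    if not communities:
        # Degenerate case: every node was a separation node.
        return [set(separation_nodes)]
    for node in separation_nodes:
        counts = [sum(1 for nb in adjacency[node] if nb in c) for c in communities]
        best = max(range(len(communities)), key=counts.__getitem__)
        communities[best].add(node)
    return communities


# Two triangles (0, 1, 2) and (4, 5, 6) bridged by node 3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5, 6], 5: [3, 4, 6], 6: [4, 5],
}
cores = community_cores(graph, [3])
communities = assign_greedily(graph, cores, [3])
```

In this toy graph, removing node 3 leaves the two triangles as cores, and node 3 joins the second community because two of its three neighbours lie there.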

Non-intrusive surrogate modelling using sparse random features with applications in crashworthiness analysis

Maternus Herold, Anna Veselovska, Jonas Jehle, Felix Krahmer

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-30

Efficient surrogate modelling is a key requirement for uncertainty quantification in data-driven scenarios. This work describes a novel approach that uses Sparse Random Features for surrogate modelling in combination with self-supervised dimensionality reduction. The method is compared to other methods on synthetic data and on real data obtained from crashworthiness analyses. The results show that the described approach outperforms the state-of-the-art surrogate modelling techniques Polynomial Chaos Expansions and Neural Networks.

Restricting to the chip architecture maintains the quantum neural network accuracy, if the parameterization is a $2$-design

Lucas Friedrich, Jonas Maziero

Categories: Artificial Intelligence | Machine Learning

2022-12-29

In the era of noisy intermediate-scale quantum devices, variational quantum circuits (VQCs) are currently one of the main strategies for building quantum machine learning models. These models are made up of a quantum part and a classical part. The quantum part is given by a parameterization $U$, which, in general, is obtained from the product of different quantum gates. In turn, the classical part corresponds to an optimizer that updates the parameters of $U$ in order to minimize a cost function $C$. However, despite the many applications of VQCs, there are still questions to be answered, for example: What is the best sequence of gates to use? How should their parameters be optimized? Which cost function should be used? How does the architecture of the quantum chips influence the final results? In this article, we focus on answering the last question. We show that, in general, the cost function tends to a typical average value the closer the parameterization used is to a $2$-design. Therefore, the closer this parameterization is to a $2$-design, the less the result of the quantum neural network model depends on its parameterization. As a consequence, we can use the architecture of the quantum chips itself to define the VQC parameterization, avoiding the use of additional swap gates and thus reducing the VQC depth and the associated errors.

Cramming: Training a Language Model on a Single GPU in One Day

Jonas Geiping, Tom Goldstein

Categories: Natural Language Processing | Machine Learning

2022-12-28

Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment where training language models is out of reach for most researchers and practitioners. While most in the community are asking how to push the limits of extreme computation, we ask the opposite question: How far can we get with a single GPU in just one day? We investigate the downstream performance achievable with a transformer-based language model trained completely from scratch with masked language modeling for a single day on a single consumer GPU. Aside from re-analyzing nearly all components of the pretraining pipeline for this scenario and providing a modified pipeline with performance close to BERT, we investigate why scaling down is hard, and which modifications actually improve performance in this scenario. We provide evidence that even in this constrained setting, performance closely follows scaling laws observed in large-compute settings. Through the lens of scaling laws, we categorize a range of recent improvements to training and architecture and discuss their merit and practical applicability (or lack thereof) for the limited compute setting.

Continual Causal Abstractions

Matej Zečević, Moritz Willig, Jonas Seng, Florian Peter Busch

Categories: Artificial Intelligence

2022-12-23

This short paper discusses continually updated causal abstractions as a potential direction of future research. The key idea is to revise the existing level of causal abstraction to a different level of detail that is both consistent with the history of observed data and more effective in solving a given task.

A comprehensive analysis of the Elo rating algorithm: Stochastic model, convergence characteristics, design guidelines, and experimental results

Daniel Gomes de Pinho Zanco, Leszek Szczecinski, Eduardo Vinicius Kuhn, Rui Seara

Categories: Machine Learning | Artificial Intelligence

2022-12-22

The Elo algorithm, due to its simplicity, is widely used for rating in sports competitions as well as in other applications where the rating/ranking is a useful tool for predicting future results. However, despite its widespread use, a detailed understanding of the convergence properties of the Elo algorithm is still lacking. Aiming to fill this gap, this paper presents a comprehensive (stochastic) analysis of the Elo algorithm, considering round-robin (one-on-one) competitions. Specifically, analytical expressions are derived characterizing the behavior/evolution of the skills and of important performance metrics. Then, taking into account the relationship between the behavior of the algorithm and the step-size value, which is a hyperparameter that can be controlled, some design guidelines as well as discussions about the performance of the algorithm are provided. To illustrate the applicability of the theoretical findings, experimental results are shown, corroborating the very good match between analytical predictions and those obtained from the algorithm using real-world data (from the Italian SuperLega, Volleyball League).
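
For orientation, the rating update that this analysis concerns can be sketched in its classic textbook form. This is a minimal sketch under stated assumptions: the logistic expected-score model with a scale of 400 and the default step size of 32 are conventional choices, not values taken from the paper, and the function names are hypothetical. The `step_size` argument plays the role of the controllable hyperparameter mentioned in the abstract.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of player A against player B under the logistic model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(
    rating_a: float,
    rating_b: float,
    score_a: float,
    step_size: float = 32.0,  # assumed default, not from the paper
    scale: float = 400.0,     # assumed default, not from the paper
) -> tuple[float, float]:
    """Return the updated (rating_a, rating_b) after one game.

    score_a is 1.0 if A wins, 0.5 for a draw, and 0.0 if A loses.
    """
    delta = step_size * (score_a - expected_score(rating_a, rating_b, scale))
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta
```

With two players rated 1500 and a win for the first, `elo_update(1500.0, 1500.0, 1.0)` moves the ratings to 1516 and 1484. A larger step size makes ratings react faster to new results but also makes them noisier, which is the trade-off that design guidelines for this hyperparameter have to balance.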

ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models

Jonas Belouadi, Steffen Eger

Categories: Natural Language Processing

2022-12-20

State-of-the-art poetry generation systems are often complex. They either consist of task-specific model pipelines, incorporate prior knowledge in the form of manually created constraints, or both. In contrast, end-to-end models would not suffer from the overhead of having to model prior knowledge and could learn the nuances of poetry from data alone, reducing the degree of human supervision required. In this work, we investigate end-to-end poetry generation conditioned on styles such as rhyme, meter, and alliteration. We identify and address a lack of training data and mismatching tokenization algorithms as possible limitations of past attempts. In particular, we successfully pre-train and release ByGPT5, a new token-free decoder-only language model, and fine-tune it on a large custom corpus of English and German quatrains annotated with our styles. We show that ByGPT5 outperforms other models such as mT5, ByT5, GPT-2 and ChatGPT, while also being more parameter-efficient and performing favorably compared to humans. In addition, we analyze its runtime performance and introspect the model's understanding of style conditions. We make our code, models, and datasets publicly available.

VoronoiPatches: Evaluating A New Data Augmentation Method

Steffen Illium, Gretchen Griffin, Michael Kölle, Maximilian Zorn, Jonas Nüßlein, Claudia Linnhoff-Popien

Categories: Computer Vision | Machine Learning

2022-12-20

Overfitting is a problem in Convolutional Neural Networks (CNNs) that causes poor generalization of models on unseen data. To remedy this problem, many new and diverse data augmentation (DA) methods have been proposed to supplement or generate more training data, and thereby increase its quality. In this work, we propose a new data augmentation algorithm: VoronoiPatches (VP). We primarily utilize non-linear recombination of information within an image, fragmenting and occluding small information patches. Unlike other DA methods, VP uses small convex polygon-shaped patches in a random layout to transport information around within an image. Sudden transitions created between patches and the original image can, optionally, be smoothed. In our experiments, VP outperformed current DA methods regarding model variance and overfitting tendencies. We demonstrate that data augmentation utilizing non-linear recombination of information within images, together with non-orthogonal shapes and structures, improves CNN model robustness on unseen data.
